Towards Scalable Speech Act Recognition in Twitter: Tackling Insufficient Training Data

Authors

  • Renxian Zhang
  • Dehong Gao
  • Wenjie Li
Abstract

Recognizing speech act types in Twitter is of much theoretical interest and practical use. Our previous research did not adequately address the shortage of training data for this multi-class learning task. In this work, we start from only a small seed training set and experiment with two semi-supervised learning schemes, transductive SVM and graph-based label propagation, which can leverage knowledge about unlabeled data. The efficacy of semi-supervised learning is established by our extensive experiments, which also show that transductive SVM is more suitable than graph-based label propagation for our task. The empirical findings and detailed evidence can contribute to scalable speech act recognition in Twitter.
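
To make the idea of graph-based label propagation from a small seed set concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the toy tweets, the two class names, the bag-of-words cosine graph, and the function name `propagate_labels` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_labels(texts, seed_labels, classes, iterations=20):
    """Spread labels from seed nodes to unlabeled nodes over a similarity graph.

    seed_labels maps a text index to its known class (hypothetical seed set).
    """
    vectors = [Counter(t.lower().split()) for t in texts]
    n = len(texts)
    # Edge weights: cosine similarity between tweets, no self-loops.
    weights = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    # One score per class for every node; seeds start with a one-hot row.
    scores = [[1.0 if seed_labels.get(i) == c else 0.0 for c in classes]
              for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seed_labels or total == 0.0:
                # Seeds stay clamped; isolated nodes keep their scores.
                updated.append(scores[i])
                continue
            # Weighted average of the neighbours' current scores.
            updated.append([
                sum(weights[i][j] * scores[j][k] for j in range(n)) / total
                for k in range(len(classes))
            ])
        scores = updated
    return [classes[max(range(len(classes)), key=lambda k: scores[i][k])]
            for i in range(n)]


# Toy data (assumed, not from the paper): two labeled seeds, two unlabeled tweets.
texts = [
    "what time does the game start",
    "the game was great tonight",
    "what time does the store open",
    "the movie was great tonight",
]
seed_labels = {0: "question", 1: "statement"}
classes = ["question", "statement"]

predicted = propagate_labels(texts, seed_labels, classes)
print(predicted)
```

The seed rows are clamped on every iteration, so the labeled tweets act as fixed sources while the unlabeled tweets take the weighted average of their neighbours. A transductive SVM, the other scheme named in the abstract, is not covered by this sketch.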

Similar resources

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

What Are Tweeters Doing: Recognizing Speech Acts in Twitter

Speech acts provide good insights into the communicative behavior of tweeters on Twitter. This paper is mainly concerned with speech act recognition in Twitter as a multiclass classification problem, for which we propose a set of word-based and character-based features. Inexpensive, robust and efficient, our method achieves an average F1 score of nearly 0.7 with the existence of much noise in o...

Role of Monolingualism/Bilingualism on Pragmatic Awareness and Production of Apology Speech Act of English as a Second and Third Language

The present study investigated the pragmatic awareness and production of Iranian Turkish and Persian EFL learners in the speech act of apology. Sixty-eight learners of English studying at several universities in Iran were selected based on simple random sampling as the monolingual and bilingual participants. Data were elicited by means of a written discourse self-assessment/completion test (WDS...

Tweet Acts: A Speech Act Classifier for Twitter

Speech acts are a way to conceptualize speech as action. This holds true for communication on any platform, including social media platforms such as Twitter. In this paper, we explored speech act recognition on Twitter by treating it as a multi-class classification problem. We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features. We trained and...

Publication date: 2012